Smart Paper Notes

On the Utility Recovery Incapability of Neural Net-based Differential Private Tabular Training Data Synthesizer under Privacy Deregulation

Yucong Liu , Chi-Hua Wang , Guang Cheng

Category: Machine Learning

2022-11-28

Devising procedures for auditing the privacy-utility tradeoff of generative models is an important yet unresolved problem in practice. Existing works concentrate on the side effect of privacy constraints, namely the utility degradation observed under the train-on-synthetic, test-on-real paradigm of synthetic data training. We push this understanding of the privacy-utility tradeoff to the next level by observing the side effect of privacy deregulation on the utility of synthetic training data. Surprisingly, we discover the Utility Recovery Incapability of DP-CTGAN and PATE-CTGAN under privacy deregulation, raising concerns about their practical applications. The main message is that Privacy Deregulation does NOT always imply Utility Recovery.


Non-Stationary Dynamic Pricing Via Actor-Critic Information-Directed Pricing

Po-Yi Liu , Chi-Hua Wang , Heng-Hsui Tsai

Category: Machine Learning (Statistics) | Machine Learning

2022-08-19

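This paper presents a novel non-stationary dynamic pricing algorithm design, in which pricing agents face incomplete demand information and market environment shifts. The agents run price experiments to learn each product's demand curve and its optimal price, while staying aware of market environment shifts to avoid the high opportunity cost of offering sub-optimal prices. The proposed ACIDP extends information-directed sampling (IDS) algorithms from statistical machine learning to include microeconomic choice theory, and adopts a novel pricing strategy auditing procedure to escape sub-optimal pricing after a market environment shift. The proposed ACIDP outperforms competing bandit algorithms, including Upper Confidence Bound (UCB) and Thompson sampling (TS), across a series of market environment shifts.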


Residual Bootstrap Exploration for Stochastic Linear Bandit

Shuang Wu , Chi-Hua Wang , Yuantong Li , Guang Cheng

Category: Machine Learning (Statistics) | Machine Learning

2022-02-23

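We propose a new bootstrap-based online algorithm for stochastic linear bandit problems. The key idea is to adopt residual bootstrap exploration, in which the agent estimates the next-step reward by re-sampling the residuals of the mean reward estimate. Our algorithm, residual bootstrap exploration for stochastic linear bandit (LinReBoot), estimates the linear reward from its re-sampling distribution and pulls the arm with the highest reward estimate. In particular, we contribute a theoretical framework to demystify residual bootstrap-based exploration mechanisms in stochastic linear bandit problems. The key insight is that the strength of bootstrap exploration rests on collaborated optimism between the online-learned model and the re-sampling distribution of residuals. This observation enables us to show that the proposed LinReBoot secures a high-probability $\tilde{O}(d\sqrt{n})$ sub-linear regret under mild conditions. Our experiments support the easy generalizability of the ReBoot principle across various formulations of linear bandit problems and show the significant computational efficiency of LinReBoot.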
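The residual-resampling idea described in this abstract can be illustrated with a minimal sketch. This is not the authors' LinReBoot algorithm: it is a simplified plain residual bootstrap on a 2-dimensional ridge regression, and the helper names (`ridge_fit_2d`, `bootstrap_theta`, `choose_arm`, `run`), the arm set, the true parameter, and the noise level are all assumptions made up for illustration. The sketch makes no attempt to guarantee sufficient exploration or any regret bound.

```python
import random

def ridge_fit_2d(xs, ys, lam=1.0):
    """Closed-form ridge regression for 2-dimensional feature vectors."""
    a11 = lam + sum(x[0] * x[0] for x in xs)
    a12 = sum(x[0] * x[1] for x in xs)
    a22 = lam + sum(x[1] * x[1] for x in xs)
    b1 = sum(x[0] * y for x, y in zip(xs, ys))
    b2 = sum(x[1] * y for x, y in zip(xs, ys))
    det = a11 * a22 - a12 * a12  # strictly positive whenever lam > 0
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

def predict(theta, x):
    """Linear reward estimate <theta, x>."""
    return theta[0] * x[0] + theta[1] * x[1]

def bootstrap_theta(xs, ys, rng, lam=1.0):
    """Refit the model on pseudo-rewards built from resampled residuals."""
    theta_hat = ridge_fit_2d(xs, ys, lam)
    residuals = [y - predict(theta_hat, x) for x, y in zip(xs, ys)]
    # Pseudo-reward = fitted value + a residual drawn with replacement.
    ys_star = [predict(theta_hat, x) + rng.choice(residuals) for x in xs]
    return ridge_fit_2d(xs, ys_star, lam)

def choose_arm(arms, xs, ys, rng):
    """Pull the arm with the highest bootstrapped reward estimate."""
    if not xs:  # no data yet: pick an arm uniformly at random
        return rng.randrange(len(arms))
    theta_star = bootstrap_theta(xs, ys, rng)
    return max(range(len(arms)), key=lambda i: predict(theta_star, arms[i]))

def run(n_rounds=300, seed=0):
    """Simulate the bandit loop on a hypothetical 3-arm problem."""
    rng = random.Random(seed)
    true_theta = (1.0, -0.5)                      # assumed, for simulation only
    arms = [(1.0, 0.0), (0.0, 1.0), (0.7, 0.7)]   # assumed arm features
    xs, ys, pulls = [], [], [0] * len(arms)
    for _ in range(n_rounds):
        i = choose_arm(arms, xs, ys, rng)
        reward = predict(true_theta, arms[i]) + rng.gauss(0.0, 0.1)
        xs.append(arms[i])
        ys.append(reward)
        pulls[i] += 1
    return pulls
```

Calling `run()` returns the number of times each arm was pulled; the randomness in the decision comes entirely from which residuals are resampled in each round.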


Online Regularization towards Always-Valid High-Dimensional Dynamic Pricing

Chi-Hua Wang , Zhanyu Wang , Will Wei Sun , Guang Cheng

Category: Machine Learning (Statistics) | Machine Learning

2020-07-05

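Devising a dynamic pricing policy with an always-valid online statistical learning procedure is an important and as yet unresolved problem. Most existing dynamic pricing policies, which focus on the faithfulness of the adopted customer choice models, exhibit a limited capability for adapting to the online uncertainty of the learned statistical model during the pricing process. In this paper, we propose a novel approach for designing dynamic pricing policies based on regularized online statistical learning with theoretical guarantees. The new approach overcomes the challenge of continuous monitoring of the online Lasso procedure and possesses several appealing properties. In particular, we make the decisive observation that the always-validity of pricing decisions builds and thrives on the online regularization scheme. Our proposed online regularization scheme equips the proposed optimistic online regularized maximum likelihood pricing (OORMLP) policy with three major advantages: it encodes market noise knowledge into pricing process optimism; it empowers online statistical learning with always-validity over all decision points; and it envelops the prediction error process with time-uniform non-asymptotic oracle inequalities. This type of non-asymptotic inference result allows us to design more sample-efficient and robust dynamic pricing algorithms in practice. In theory, the proposed OORMLP algorithm exploits the sparsity structure of high-dimensional models and secures a logarithmic regret over the decision horizon. These theoretical advances are made possible by proposing an optimistic online Lasso procedure that resolves dynamic pricing problems at the process level, based on a novel use of non-asymptotic martingale concentration. In experiments, we evaluate OORMLP in different synthetic and real pricing problem settings, and demonstrate that OORMLP advances the state-of-the-art methods.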


Cross Modal Transformer via Coordinates Encoding for 3D Object Detection

Junjie Yan , Yingfei Liu , Jianjian Sun , Fan Jia , Shuailin Li , Tiancai Wang , Xiangyu Zhang

Category: Computer Vision

2023-01-03

In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple, yet its performance is impressive: CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains strongly robust even if the LiDAR is missing. Code will be released at https://github.com/junjie18/CMT.


A Survey On Few-shot Knowledge Graph Completion with Structural and Commonsense Knowledge

Haodi Ma , Daisy Zhe Wang

Category: Natural Language Processing | Artificial Intelligence | Machine Learning

2023-01-03

Knowledge graphs (KG) have served as the key component of various natural language processing applications. Commonsense knowledge graphs (CKG) are a special type of KG, where entities and relations are composed of free-form text. However, previous works in KG completion and CKG completion suffer from long-tail relations and newly added relations that do not have many known triples for training. In light of this, few-shot KG completion (FKGC), which requires the strengths of graph representation learning and few-shot learning, has been proposed to tackle the problem of limited annotated data. In this paper, we comprehensively survey previous attempts on such tasks in the form of a series of methods and applications. Specifically, we first introduce FKGC challenges, commonly used KGs, and CKGs. Then we systematically categorize and summarize existing works in terms of the type of KGs and the methods. Finally, we present applications of FKGC models on prediction tasks in different areas and share our thoughts on future research directions of FKGC.


Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation

Yue Han , Jiangning Zhang , Zhucun Xue , Chao Xu , Xintian Shen , Yabiao Wang , Chengjie Wang , Yong Liu , Xiangtai Li

Category: Computer Vision

2023-01-03

Few Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes given only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After the above steps, performance on novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset under the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots, e.g., we boost nAP by a noticeable +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few Shot Object Detection. Code and model will be available.


RELIANT: Fair Knowledge Distillation for Graph Neural Networks

Yushun Dong , Binchi Zhang , Yiling Yuan , Na Zou , Qi Wang , Jundong Li

Category: Machine Learning

2023-01-03

Graph Neural Networks (GNNs) have shown satisfying performance on various graph learning tasks. To achieve better fitting capability, most GNNs have a large number of parameters, which makes them computationally expensive. Therefore, it is difficult to deploy them onto edge devices with scarce computational resources, e.g., mobile phones and wearable smart devices. Knowledge Distillation (KD) is a common solution to compress GNNs, where a lightweight model (i.e., the student model) is encouraged to mimic the behavior of a computationally expensive GNN (i.e., the teacher GNN model). Nevertheless, most existing GNN-based KD methods lack fairness consideration. As a consequence, the student model usually inherits and even exaggerates the bias from the teacher GNN. To handle such a problem, we take initial steps towards fair knowledge distillation for GNNs. Specifically, we first formulate a novel problem of fair knowledge distillation for GNN-based teacher-student frameworks. Then we propose a principled framework named RELIANT to mitigate the bias exhibited by the student model. Notably, the design of RELIANT is decoupled from any specific teacher and student model structures, and thus can be easily adapted to various GNN-based KD frameworks. We perform extensive experiments on multiple real-world datasets, which corroborate that RELIANT achieves less biased GNN knowledge distillation while maintaining high prediction utility.
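The teacher-student mimicry that this abstract builds on can be sketched with the generic soft-target distillation loss. This is the standard temperature-softened KL divergence between teacher and student outputs, not the RELIANT framework and not a fairness-aware objective; the function names, the example logits, and the temperature value are assumptions chosen for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened class probabilities."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )

# Hypothetical logits for one node over three classes.
teacher = [1.0, 2.0, 3.0]
student = [3.0, 2.0, 1.0]
loss = distillation_loss(student, teacher)
```

The loss is zero when the student reproduces the teacher's softened distribution exactly and grows as the two diverge, which is also why a student trained this way can inherit whatever bias the teacher's outputs carry.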


Rethinking Mobile Block for Efficient Neural Models

Jiangning Zhang , Xiangtai Li , Jian Li , Liang Liu , Zhucun Xue , Boshen Zhang , Zhengkai Jiang , Tianxin Huang , Yabiao Wang , Chengjie Wang

Category: Computer Vision

2023-01-03

This paper focuses on designing efficient models with few parameters and low FLOPs for dense predictions. Even though CNN-based lightweight methods have achieved stunning results after years of research, the trade-off between model accuracy and constrained resources still needs further improvement. This work rethinks the essential unity of the efficient Inverted Residual Block in MobileNetv2 and the effective Transformer in ViT, inductively abstracting a general concept of Meta-Mobile Block, and we argue that the specific instantiation is very important to model performance even when the same framework is shared. Motivated by this phenomenon, we deduce a simple yet efficient modern Inverted Residual Mobile Block (iRMB) for mobile applications, which absorbs CNN-like efficiency to model short-distance dependency and Transformer-like dynamic modeling capability to learn long-distance interactions. Furthermore, we design a ResNet-like 4-phase Efficient MOdel (EMO) based only on a series of iRMBs for dense applications. Extensive experiments on the ImageNet-1K, COCO2017, and ADE20K benchmarks demonstrate the superiority of our EMO over state-of-the-art methods, e.g., our EMO-1M/2M/5M achieve 71.5, 75.1, and 78.4 Top-1, surpassing SoTA CNN-/Transformer-based models while trading off model accuracy and efficiency well.


MGTAB: A Multi-Relational Graph-Based Twitter Account Detection Benchmark

Shuhao Shi , Kai Qiao , Jian Chen , Shuai Yang , Jie Yang , Baojie Song , Linyuan Wang , Bin Yan

Category: Computer Vision

2023-01-03

The development of social media user stance detection and bot detection methods relies heavily on large-scale, high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, which hinders graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain, together with user tweet features, as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when multiple relations are introduced. By analyzing the experimental results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.
